Semi-Supervised Contextual Word Sense Disambiguation for Data Augmentation
Abstract
Deep learning architectures that have recently come to prominence in artificial intelligence have led to notable advances in Word Sense Disambiguation (WSD), one of the important problems of natural language processing. Supervised methods outperform their competitors, mainly because of the size of the training data they use. For English, a large amount of hand-labeled WSD data is accessible online. Low-resource languages (LRLs), however, lack data suited to the problem, and collecting and labeling enough of it is time-consuming and costly. To overcome this, the study aims to show that a semi-supervised contextual word sense disambiguation approach can be used for data augmentation, with the resulting data used later in supervised learning. Because test data are hard to find, especially for LRLs, resources available online were used to demonstrate the accuracy and future usability of the approach. The proposed method uses a seed set and context embeddings. It was tested with 9 different models (ELMo, BERT, RoBERTa, etc.), and the effect of each model is reported. Compared with the initial baseline approach, accuracy improved by about 28 percentage points (baseline with ELMo: 50.39%; ELMo Seed-Based Average Similarity Model: 78.06%). The results show that the proposed approach is promising for creating data for LRLs. This article is an extended version of our work in [18].
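The abstract names a seed set, context embeddings, and a seed-based average similarity model, but does not spell out the procedure. The following is a minimal sketch of one plausible reading, not the authors' implementation: an unlabeled context is assigned the sense whose seed embeddings have the highest mean cosine similarity to it. The function names, the sense labels, and the 3-dimensional vectors are all hypothetical; real context embeddings would come from a model such as ELMo or BERT.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_similarity(target: Vector, seeds: List[Vector]) -> float:
    """Mean cosine similarity between the target and every seed of one sense."""
    return sum(cosine(target, s) for s in seeds) / len(seeds)


def label_by_seed_similarity(target: Vector, seed_set: Dict[str, List[Vector]]) -> str:
    """Return the sense whose seeds are, on average, most similar to the target."""
    return max(seed_set, key=lambda sense: average_similarity(target, seed_set[sense]))


# Toy vectors invented for illustration only (assumption, not data from the paper).
seed_set = {
    "bank/finance": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "bank/river": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
unlabeled = [0.2, 0.7, 0.1]  # embedding of an unlabeled occurrence of "bank"
predicted = label_by_seed_similarity(unlabeled, seed_set)
print(predicted)  # bank/river
```

In a data augmentation setting, contexts labeled this way could be added to the training set of a supervised model. A similarity threshold for rejecting low-confidence labels would be a natural extension, though the abstract does not say whether one is used.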
Similar resources
Word Sense Disambiguation with Semi-Supervised Learning
Current word sense disambiguation (WSD) systems based on supervised learning are still limited in that they do not work well for all words in a language. One of the main reasons is the lack of sufficient training data. In this paper, we investigate the use of unlabeled training data for WSD, in the framework of semi-supervised learning. Four semisupervised learning algorithms are evaluated on 2...
Word Sense Disambiguation by Semi-supervised Learning
In this paper we propose to use a semi-supervised learning algorithm to deal with word sense disambiguation problem. We evaluated a semi-supervised learning algorithm, local and global consistency algorithm, on widely used benchmark corpus for word sense disambiguation. This algorithm yields encouraging experimental results. It achieves better performance than orthodox supervised learning algor...
Semi-Supervised Learning for Word Sense Disambiguation: Quality vs. Quantity
In this paper, we discuss the importance of the quality against the quantity of automatically extracted examples for word sense disambiguation (WSD). We first show that we can build a competitive WSD system with a memory-based classifier and a feature set reduced to easily and efficiently computable features. We then show that adding automatically annotated examples improves the performance of ...
Review: Semi-Supervised Learning Methods for Word Sense Disambiguation
Word sense disambiguation (WSD) is an open problem of natural language processing, which governs the process of identifying the appropriate sense of a word in a sentence, when the word has multiple meanings. Many approaches have been proposed to solve the problem, of which supervised learning approaches are the most successful. However supervised machine learning are limited by the difficulties...
Investigating Problems of Semi-supervised Learning for Word Sense Disambiguation
Word Sense Disambiguation (WSD) is the problem of determining the right sense of a polysemous word in a given context. In this paper, we will investigate the use of unlabeled data for WSD within the framework of semi supervised learning, in which the original labeled dataset is iteratively extended by exploiting unlabeled data. This paper addresses two problems occurring in this approach: deter...
Journal
Journal title: TBV Bilgisayar Bilimleri ve Mühendisliği Dergisi
Year: 2021
ISSN: 1305-8991
DOI: https://doi.org/10.54525/tbbmd.835744